Prediction of Fault-Prone Software Modules using Statistical and Machine Learning Methods

Authors

  • Yogesh Singh
  • Arvinder Kaur
  • Ruchika Malhotra
Abstract

Demand for quality software has increased rapidly over the last few years. This has led to growing development of machine learning methods for exploring data sets, which can be used to construct models that predict quality attributes such as fault proneness, maintenance effort, testing effort, productivity, and reliability. This paper examines and compares logistic regression and six machine learning methods (artificial neural network, decision tree, support vector machine, cascade correlation network, group method of data handling polynomial method, and gene expression programming). These methods are explored empirically to find the effect of static code metrics on the fault proneness of software modules. We use the publicly available data set AR1 to analyze and compare the regression and machine learning methods in this study. The performance of the methods is compared by computing the area under the curve (AUC) using Receiver Operating Characteristic (ROC) analysis. The results show that the AUC of the model predicted using decision tree modeling is 0.865, and that it is a better model than the model predicted using regression
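As a rough illustration of the comparison criterion named in the abstract, the sketch below computes the area under the ROC curve from predicted fault-proneness scores, using the pairwise-ranking (Mann-Whitney) formulation. The function name `roc_auc` and the toy labels and scores are assumptions made for this example; they are not taken from the paper or from the AR1 data set, and the paper's own tooling is not stated here.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the Mann-Whitney pairwise count.

    labels: 1 for a faulty module, 0 for a fault-free module.
    scores: predicted fault proneness; higher means more likely faulty.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one faulty and one fault-free module")

    # Count (faulty, fault-free) pairs in which the faulty module receives
    # the higher score; a tie counts as half a correctly ordered pair.
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data (NOT from AR1): six modules, three of them faulty.
labels = [1, 0, 1, 0, 0, 1]
model_a = [0.9, 0.2, 0.7, 0.4, 0.1, 0.8]  # scores every faulty module highest
model_b = [0.6, 0.5, 0.3, 0.7, 0.2, 0.8]  # misorders some pairs

print(roc_auc(labels, model_a))
print(roc_auc(labels, model_b))
```

Under this criterion, the model with the larger AUC is the one that more often scores a faulty module above a fault-free one, which is the sense in which one model is called better than another in the abstract.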


Similar resources

Evaluation of Classifiers in Software Fault-Proneness Prediction

The reliability of software depends on its fault-prone modules: the fewer fault-prone units the software contains, the more we may trust it. Therefore, if we are able to predict the number of fault-prone modules of software, it will be possible to judge the software reliability. In predicting software fault-prone modules, one of the contributing features is software metric by which one ...


Comparative Analysis of Random Forests with Statistical and Machine Learning Methods in Predicting Fault-Prone Classes

There are available metrics for predicting fault prone classes, which may help software organizations for planning and performing testing activities. This may be possible due to proper allocation of resources on fault prone parts of the design and code of the software. Hence, importance and usefulness of such metrics is understandable, but empirical validation of these metrics is always a great...


A metric to detect fault-prone software modules using text filtering

Machine learning approaches have been widely used for fault-prone module detection. Introduction of machine learning approaches induces development of new software metrics for fault-prone module detection. We have proposed an approach to detect fault-prone modules using the spam-filtering technique. To use our approach in the conventional fault-prone module prediction approaches, we construct a ...


Prediction of Fault-prone Modules Using A Text Filtering Based Metric

Machine-learning approaches have been widely used for fault-proneness detection. Introduction of machine learning approaches induces development of new software metrics for fault-prone module detection. We have proposed an approach to detect fault-prone modules using the spam-filtering technique. To treat our approach as the conventional fault-prone approaches, we summarize the output of spam-fi...


An Approach to Early Fault Prediction in Software Systems Using K-Means Clustering

Quality of a software component can be measured in terms of fault proneness of data. Quality estimations are made using fault proneness data available from previously developed similar type of projects and the training data consisting of software measurements. To predict faulty modules in software data different techniques have been proposed which includes statistical method, machine learning m...




Publication date: 2010